Computing inter-rater reliability and its variance in the presence of high agreement.

Author

Kilem Li Gwet

Abstract

Pi (π) and kappa (κ) statistics are widely used in the areas of psychiatry and psychological testing to compute the extent of agreement between raters on nominally scaled data. These coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limitations, and introduces an alternative and more stable agreement coefficient referred to as the AC1 coefficient. Also proposed are new variance estimators for the multiple-rater generalized pi and AC1 statistics, whose validity does not depend upon the hypothesis of independence between raters. This is an improvement over existing alternative variances, which depend on the independence assumption. A Monte-Carlo simulation study demonstrates the validity of these variance estimators for confidence interval construction, and confirms the value of AC1 as an improved alternative to existing inter-rater reliability statistics.
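
The abstract gives no formulas, so the following is only a minimal sketch of the kind of paradox it refers to. It assumes the commonly cited two-rater definitions of Cohen's kappa, Scott's pi, and Gwet's AC1, which differ only in how chance agreement is estimated. The function name `agreement_coefficients` and the example table are hypothetical, and the paper's multiple-rater versions and variance estimators are not reproduced here.

```python
def agreement_coefficients(table):
    """Two-rater agreement coefficients from a q x q table of counts.

    table[i][j] = number of subjects rater A put in category i
    and rater B put in category j. Assumes the standard two-rater
    definitions; this is an illustration, not the paper's code.
    """
    q = len(table)
    n = sum(sum(row) for row in table)
    p = [[count / n for count in row] for row in table]

    # Observed agreement: proportion of subjects on the diagonal.
    p_obs = sum(p[k][k] for k in range(q))

    # Marginal classification proportions for each rater.
    row_marg = [sum(p[k]) for k in range(q)]
    col_marg = [sum(p[i][k] for i in range(q)) for k in range(q)]
    # Average proportion of classifications into category k.
    avg = [(row_marg[k] + col_marg[k]) / 2 for k in range(q)]

    # Chance-agreement estimates, one per coefficient.
    pe_kappa = sum(row_marg[k] * col_marg[k] for k in range(q))
    pe_pi = sum(a * a for a in avg)
    pe_ac1 = sum(a * (1 - a) for a in avg) / (q - 1)

    def chance_corrected(p_chance):
        return (p_obs - p_chance) / (1 - p_chance)

    return {
        "observed": p_obs,
        "kappa": chance_corrected(pe_kappa),
        "pi": chance_corrected(pe_pi),
        "ac1": chance_corrected(pe_ac1),
    }


# Hypothetical data: 100 subjects, raters agree on 90 of them,
# almost all in the first category (high agreement, skewed marginals).
EXAMPLE_TABLE = [[90, 5],
                 [5, 0]]

result = agreement_coefficients(EXAMPLE_TABLE)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With this made-up table the raters agree on 90% of subjects, yet kappa and pi come out slightly negative because their chance-agreement terms are close to 1 when one category dominates, while AC1 stays near 0.89.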

Similar resources

Test-Retest and Inter-Rater Reliability Study of the Schedule for Oral-Motor Assessment in Persian Children

Objectives: Reliable and valid clinical tools to screen, diagnose, and describe eating functions and dysphagia in children are highly warranted. Today most specialists are aware of the role of assessment scales in the treatment of affected individuals. However, the problem is that the clinical tools used might be nonstandard, and worldwide, there is no integrated assessment performed to assess ...

Functional Movement Screen in Elite Boy Basketball Players: A Reliability Study

Purpose: To investigate the reliability of Functional Movement Screen (FMS) in basketball players. A few studies have compared the reliability of FMS between raters with different experience in athletes. The purpose of this study was to compare the FMS scoring between beginner and expert raters using video records. Methods: This is a cross-sectional study. The study subjects compris...

Comparison between inter-rater reliability and inter-rater agreement in performance assessment.

INTRODUCTION Over the years, performance assessment (PA) has been widely employed in medical education, Objective Structured Clinical Examination (OSCE) being an excellent example. Typically, performance assessment involves multiple raters, and therefore, consistency among the scores provided by the auditors is a precondition to ensure the accuracy of the assessment. Inter-rater agreement and i...

Assessment of Environmental Hazards Using the Reliability of the Persian Version of the Home Falls and Accidents Screening Tool in Iranian Elderly

Introduction: Falling is one of the common problems among older people. Falls inside the home and in the street are frequent among the Iranian elderly, so identifying environmental factors at home and modifying the home can reduce falls and injuries in the elderly. The aim of this study was to identify elderly people at risk of falling using a screening tool (HOME FAST) and deter...

Nurse-Physician Agreement on Triage Category: A Reliability Analysis of Emergency Severity Index

Background and Objectives: The Emergency Severity Index (ESI) triage is commonly used in clinical settings to determine the patients’ emergency severity. However, the reliability of this index is not sufficiently explored. The present study examines the inter-rater reliability of ESI by comparing triage ratings as performed by nurses and physicians. Methods: This prospective cross-sectional st...

Journal:

The British journal of mathematical and statistical psychology

Volume 61, Pt 1

Publication date: 2008
